Pronunciation Proficiency Estimation Based on Multilayer Regression Analysis Using Speaker-independent Structural Features
Abstract
Teachers can assess students’ pronunciation independently of extra-linguistic features, such as age and gender, that are observed in the students’ utterances. This capacity is, however, difficult to realize on machines because linguistic and extra-linguistic differences alter the same acoustic features. As a result, the performance of automatic pronunciation assessment is inevitably affected by extra-linguistic features. Recently, we proposed acoustic features that are independent of extra-linguistic factors, called structural features, and realized a technique for pronunciation proficiency estimation that is extremely robust to these factors. In this paper, we extend this technique with multilayer regression analysis, where supervised learning is done at each layer by using teachers’ scores for that layer. Proficiency estimation experiments show that the extended technique yields higher correlations between teachers’ and machine scores than our previous structure-based assessment.
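The layer-wise supervision described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not state the number of layers, how features are grouped, or which regression model is used. The sketch assumes two layers, ordinary least squares, two hypothetical feature groups standing in for structural features, and synthetic teacher scores. The helper names (`fit_linear`, `fit_two_layer`, and so on) are likewise hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least-squares fit with an intercept term; returns the weights."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(X, w):
    """Apply weights produced by fit_linear."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ w

def fit_two_layer(feature_groups, layer1_scores, overall_scores):
    """Assumed two-layer scheme: each layer is trained on teacher scores for that layer.

    feature_groups : list of (n_speakers, d_g) arrays, one per hypothetical group
    layer1_scores  : (n_speakers, n_groups) teacher scores for each group
    overall_scores : (n_speakers,) teacher scores for overall proficiency
    """
    # Layer 1: one regression per feature group, supervised by that group's scores.
    layer1_w = [fit_linear(X, layer1_scores[:, g])
                for g, X in enumerate(feature_groups)]
    hidden = np.column_stack([predict_linear(X, w)
                              for X, w in zip(feature_groups, layer1_w)])
    # Layer 2: regress the overall teacher score on the layer-1 predictions.
    layer2_w = fit_linear(hidden, overall_scores)
    return layer1_w, layer2_w

def predict_two_layer(feature_groups, layer1_w, layer2_w):
    hidden = np.column_stack([predict_linear(X, w)
                              for X, w in zip(feature_groups, layer1_w)])
    return predict_linear(hidden, layer2_w)

# Synthetic data only; these are not real structural features or teacher scores.
rng = np.random.default_rng(0)
n_speakers = 40
groups = [rng.normal(size=(n_speakers, 3)) for _ in range(2)]
true_w = [np.array([1.0, -0.5, 0.8]), np.array([0.6, 0.9, -0.7])]
layer1 = np.column_stack([X @ w + 0.1 * rng.normal(size=n_speakers)
                          for X, w in zip(groups, true_w)])
overall = layer1.mean(axis=1)

w1, w2 = fit_two_layer(groups, layer1, overall)
pred = predict_two_layer(groups, w1, w2)

# Teacher-machine agreement measured as a Pearson correlation (in-sample here).
corr = np.corrcoef(pred, overall)[0, 1]
print(f"correlation between teacher and machine scores: {corr:.3f}")
```

The correlation above is computed on the training data, so it only shows how the pieces connect; a real evaluation would score held-out speakers.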
Similar resources
Improved structure-based automatic estimation of pronunciation proficiency
Automatic estimation of pronunciation proficiency poses a specific difficulty. Adequacy in controlling the vocal organs is often estimated from the spectral envelopes of input utterances, but the envelope patterns are also affected by changes of speaker. To develop a good and stable method for automatic estimation, the envelope changes caused by linguistic factors and those by extra-linguistic fac...
Integration of multilayer regression analysis with structure-based pronunciation assessment
Automatic pronunciation assessment has several difficulties. Adequacy in controlling the vocal organs is often estimated from the spectral envelopes of input utterances but the envelope patterns are also affected by other factors such as speaker identity. Recently, a new method of speech representation was proposed where these non-linguistic variations are effectively removed through modeling o...
Pronunciation assessment based on multilayer multiple regression analysis using structural features
With rapid internationalization and informatization, many research efforts have been made to build computer-aided language learning (CALL) systems. Good pronunciation assessment systems should be built using technologies that can deal with the acoustic variability found in learners’ utterances caused by non-linguistic factors such as age and gender. However, the widely-used acoustic modeli...
Automatic Characterisation of the Pronunciation of Non-native English Speakers using Phone Distance Features
The distances between and relative movements of phones in acoustic space in language learners have been shown to be indicative of the speaker’s proficiency, in a way that is compact and independent of bias-inducing voice qualities. Typically these features are based on known transcriptions, “read aloud” style tasks. This paper examines the information that can be extracted about speakers from p...
Multiple-pronunciation lexical modeling in a speaker independent speech understanding system
One of the sources of difficulty in speech recognition and understanding is the variability due to alternate pronunciations of words. To address the issue we have investigated the use of multiple-pronunciation models (MPMs) in the decoding stage of a speaker-independent speech understanding system. In this paper we address three important issues regarding MPMs: (a) Model construction: How can M...